13 research outputs found

    Domain-specific lexicon generation for emotion detection from text.

    Emotions play a key role in effective and successful human communication. Text is popularly used on the internet and social media websites to express and share emotions, feelings and sentiments. However, useful applications and services built to understand emotions from text are limited in effectiveness due to reliance on general-purpose emotion lexicons that have static vocabulary and sentiment lexicons that can only interpret emotions coarsely. Thus, emotion detection from text calls for methods and knowledge resources that can deal with challenges such as dynamic and informal vocabulary, domain-level variations in emotional expressions and other linguistic nuances. In this thesis, we demonstrate how labelled (e.g. blogs, news headlines) and weakly-labelled (e.g. tweets) emotional documents can be harnessed to learn word-emotion lexicons that can account for dynamic and domain-specific emotional vocabulary. We model the characteristics of real-world emotional documents to propose a generative mixture model, which iteratively estimates the language models that best describe the emotional documents using expectation maximization (EM). The proposed mixture model has the ability to model both emotionally charged words and emotion-neutral words. We then generate a word-emotion lexicon using the mixture model to quantify word-emotion associations in the form of probability vectors. Secondly, we introduce novel feature extraction methods to utilize the emotion-rich knowledge captured by our word-emotion lexicon. The extracted features are used to classify text into emotion classes using machine learning. Further, we propose hybrid text representations for emotion classification that use the knowledge of lexicon-based features in conjunction with other representations such as n-grams, part-of-speech and sentiment information. Thirdly, we propose two different methods which jointly use an emotion-labelled corpus of tweets and the emotion-sentiment mapping proposed in psychology to learn word-level numerical quantification of sentiment strengths over a positive-to-negative spectrum. Finally, we evaluate all the methods proposed in this thesis through a variety of emotion detection and sentiment analysis tasks on benchmark data sets covering domains from blogs to news articles to tweets and incident reports.
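
The mixture-model idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's exact formulation: it assumes each emotion's documents are generated by a two-component mixture of an emotion-specific unigram model and a fixed corpus-wide background (neutral) model, with a hypothetical mixing weight `lam`. The function names, the `"neutral"` key and the toy corpus are all illustrative assumptions.

```python
from collections import Counter


def train_emotion_model(docs, background, lam=0.5, iters=20):
    """EM for one emotion class (illustrative sketch).

    docs: list of token lists labelled with this emotion.
    background: fixed word -> probability dict (assumed neutral model).
    lam: assumed mixing weight of the emotion-specific model.
    """
    counts = Counter(w for d in docs for w in d)
    total = sum(counts.values())
    # Initialise the emotion model with relative frequencies.
    theta = {w: c / total for w, c in counts.items()}
    for _ in range(iters):
        # E-step: posterior that an occurrence of w came from the emotion model.
        post = {}
        for w in counts:
            p_emo = lam * theta[w]
            p_bg = (1.0 - lam) * background[w]
            post[w] = p_emo / (p_emo + p_bg)
        # M-step: re-estimate the emotion model from the weighted counts.
        weighted = {w: counts[w] * post[w] for w in counts}
        norm = sum(weighted.values())
        theta = {w: v / norm for w, v in weighted.items()}
    return theta


def build_lexicon(corpus, lam=0.5, iters=20):
    """corpus: emotion label -> list of token lists.

    Returns word -> probability vector over the emotions plus "neutral".
    """
    all_counts = Counter(w for docs in corpus.values() for d in docs for w in d)
    total = sum(all_counts.values())
    background = {w: c / total for w, c in all_counts.items()}
    models = {
        e: train_emotion_model(docs, background, lam, iters)
        for e, docs in corpus.items()
    }
    lexicon = {}
    for w in background:
        scores = {e: m.get(w, 0.0) for e, m in models.items()}
        scores["neutral"] = background[w]
        z = sum(scores.values())
        lexicon[w] = {k: v / z for k, v in scores.items()}
    return lexicon


# Toy corpus, invented purely for illustration.
toy_corpus = {
    "joy": [["happy", "day", "the"], ["the", "happy", "smile"]],
    "sadness": [["the", "sad", "day"], ["sad", "tears", "the"]],
}
toy_lexicon = build_lexicon(toy_corpus)
```

In this toy setting, words that occur only under one emotion end up dominated by that emotion, while words shared across classes, such as "the", drift toward the neutral component as EM shifts their probability mass to the background model.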

    Opinion context extraction for aspect sentiment analysis.

    Sentiment analysis is the computational study of opinionated text and is becoming increasingly important to online commercial applications. However, the majority of current approaches determine sentiment by attempting to detect the overall polarity of a sentence, paragraph, or text window, but without any knowledge about the entities mentioned (e.g. restaurant) and their aspects (e.g. price). Aspect-level sentiment analysis of customer feedback data, when done accurately, can be leveraged to understand strong and weak performance points of businesses and services, and can also support the formulation of critical action steps to improve performance. In this paper, we focus on aspect-level sentiment classification, studying the role of opinion context extraction for a given aspect and the extent to which traditional and neural sentiment classifiers benefit when trained using the opinion context text. We propose four methods for aspect context extraction using lexical, syntactic and sentiment co-occurrence knowledge. Further, we evaluate the usefulness of the opinion contexts for aspect-sentiment analysis. Our experiments on benchmark data sets from SemEval and a real-world dataset from the insurance domain suggest that extracting the right opinion context is effective in improving classification performance. Specifically, combining syntactical features with sentiment co-occurrence knowledge leads to the best aspect-sentiment classification performance.

    Lexicon based feature extraction for emotion text classification.

    General Purpose Emotion Lexicons (GPELs) that associate words with emotion categories remain a valuable resource for emotion analysis of text. However, the static and formal nature of their vocabularies makes them inadequate for extracting effective features for document representation in domains that are inherently dynamic in nature (e.g. social media). This calls for lexicons that are not only adaptive to the lexical variations in a domain but also provide finer-grained quantitative estimates to accurately capture word-emotion associations. In this paper, we extend prior work on domain-specific emotion lexicon (DSEL) generation and apply it to emotion feature extraction. We demonstrate how our generative unigram mixture model (UMM) based DSEL, learnt by harnessing labelled (blogs, news headlines and incident reports) and weakly-labelled (tweets) emotion text, can be used to extract effective features for emotion classification. Our results confirm that the features derived using the proposed lexicon outperform those from state-of-the-art lexicons learnt using supervised Latent Dirichlet Allocation (sLDA) and Point-Wise Mutual Information (PMI). Further, the proposed lexicon features also outperform state-of-the-art features derived using a combination of n-grams, part-of-speech information and sentiment lexicons.

    Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generation

    Counterspeech has been demonstrated to be an efficacious approach for combating hate speech. While various conventional and controlled approaches have been studied in recent years to generate counterspeech, a counterspeech with a certain intent may not be sufficient in every scenario. Due to the complex and multifaceted nature of hate speech, utilizing multiple forms of counter-narratives with varying intents may be advantageous in different circumstances. In this paper, we explore intent-conditioned counterspeech generation. First, we develop IntentCONAN, a diversified intent-specific counterspeech dataset with 6831 counterspeeches conditioned on five intents, i.e., informative, denouncing, question, positive, and humour. Subsequently, we propose QUARC, a two-stage framework for intent-conditioned counterspeech generation. QUARC leverages vector-quantized representations learned for each intent category along with PerFuMe, a novel fusion module to incorporate intent-specific information into the model. Our evaluation demonstrates that QUARC outperforms several baselines by an average of 10% across evaluation metrics. An extensive human evaluation supplements our hypothesis of better and more appropriate responses than comparative systems. Comment: ACL 202

    Context extraction for aspect-based sentiment analytics: combining syntactic, lexical and sentiment knowledge.

    Aspect-level sentiment analysis of customer feedback data, when done accurately, can be leveraged to understand strong and weak performance points of businesses and services and to formulate critical action steps to improve their performance. In this work, we focus on aspect-level sentiment classification, studying the role of opinion context extraction for a given aspect and the extent to which traditional and neural sentiment classifiers benefit when trained using the opinion context text. We introduce a novel method that combines lexical, syntactical and sentiment knowledge effectively to extract opinion context for aspects. Thereafter, we validate the quality of the extracted opinion contexts against human judgments using the BLEU score. Further, we evaluate the usefulness of the opinion contexts for aspect-sentiment analysis. Our experiments on benchmark data sets from SemEval and a real-world dataset from the insurance domain suggest that extracting the right opinion context, combining syntactical with sentiment co-occurrence knowledge, leads to the best aspect-sentiment classification performance. From a commercial point of view, accurate aspect extraction provides an elegant means to identify 'pain-points' in a business. Integrating our work into a commercial CX platform (https://www.sentisum.com/) is enabling the company’s clients to better understand their customer opinions.

    Emotion-aware polarity lexicons for Twitter sentiment analysis.

    Theoretical frameworks in psychology map the relationships between emotions and sentiments. In this paper, we study the role of such mapping for computational emotion detection from text (e.g. social media), with an aim to understand the usefulness of an emotion-rich corpus of documents (e.g. tweets) to learn polarity lexicons for sentiment analysis. We propose two different methods that leverage a corpus of emotion-labelled tweets to learn word-polarity lexicons. The proposed methods model the emotion corpus using a generative unigram mixture model (UMM), combined with the emotion-sentiment mapping proposed in psychology, for automated generation of word-polarity lexicons that capture emotion-rich vocabulary. We comparatively evaluate the quality of the proposed mixture model in learning emotion-aware sentiment lexicons against those generated using supervised Latent Dirichlet Allocation (sLDA) and word-document frequency (WDF) statistics. Sentiment analysis experiments on benchmark Twitter data sets confirm the quality of our proposed lexicons. Further, a comparative analysis with sLDA- and WDF-based emotion-aware lexicons and with standard sentiment lexicons that are agnostic to emotion knowledge suggests that the proposed lexicons lead to significantly better performance in both sentiment classification and sentiment intensity prediction tasks.
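
As a rough illustration of how an emotion-sentiment mapping could turn a word's emotion probabilities into a polarity score, consider the sketch below. The abstract does not specify the mapping or the scoring rule, so both are assumptions: the `EMOTION_SIGN` table and the normalised difference in `word_polarity` are hypothetical choices, not the paper's two methods.

```python
# Hypothetical emotion -> sentiment sign mapping (an assumption for illustration;
# emotions mapped to 0, and unlisted keys such as "neutral", are ignored).
EMOTION_SIGN = {
    "joy": 1,
    "love": 1,
    "anger": -1,
    "fear": -1,
    "sadness": -1,
    "surprise": 0,
}


def word_polarity(emotion_probs, mapping=EMOTION_SIGN):
    """Map a word's emotion probability vector to a score in [-1, 1].

    emotion_probs: emotion label -> probability for a single word.
    Returns (pos - neg) / (pos + neg), or 0.0 when no mapped emotion has mass.
    """
    pos = sum(p for e, p in emotion_probs.items() if mapping.get(e, 0) > 0)
    neg = sum(p for e, p in emotion_probs.items() if mapping.get(e, 0) < 0)
    if pos + neg == 0:
        return 0.0
    return (pos - neg) / (pos + neg)


# Invented probability vector for a single word.
example_score = word_polarity(
    {"joy": 0.6, "anger": 0.1, "sadness": 0.1, "neutral": 0.2}
)
```

Normalising by the mass assigned to polar emotions keeps the score on a fixed positive-to-negative scale regardless of how much probability a word gives to the neutral class.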

    Predicting Emotional Reaction in Social Networks

    Online content has shifted from static and document-oriented to dynamic and discussion-oriented, leading users to spend an increasing amount of time navigating online discussions in order to participate in their social network. Recent work on emotional contagion in social networks has shown that information is not neutral and affects its receiver. In this work, we present an approach to detect the emotional impact of news, using a dataset extracted from the Facebook pages of a major news provider. The results of our approach significantly outperform our selected baselines.

    Symmetric Language-Aware Aspects for Modular Code Generators

    General-purpose emotion lexicons (GPELs) that associate words with emotion categories remain a valuable resource for emotion detection. However, the static and formal nature of their vocabularies makes them an inadequate resource for detecting emotions in domains that are inherently dynamic in nature. This calls for lexicons that are not only adaptive to the lexical variations in a domain but which also provide finer-grained quantitative estimates to accurately capture word-emotion associations. In this article, the authors demonstrate how to harness labeled emotion text (such as blogs and news headlines) and weakly labeled emotion text (such as tweets) to learn a word-emotion association lexicon by jointly modeling emotionality and neutrality of words using a generative unigram mixture model (UMM). Empirical evaluation confirms that UMM-generated emotion language models (topics) have significantly lower perplexity compared to those from state-of-the-art generative models like supervised Latent Dirichlet Allocation (sLDA). Further, emotion detection tasks involving word-emotion classification and document-emotion ranking confirm that the UMM lexicon significantly outperforms GPELs and also state-of-the-art domain-specific lexicons.